Guided Proofreading of Automatic Segmentations for Connectomics
Automatic cell image segmentation methods in connectomics produce merge and
split errors, which require correction through proofreading. Previous research
has identified the visual search for these errors as the bottleneck in
interactive proofreading. To aid error correction, we develop two classifiers
that automatically recommend candidate merges and splits to the user. These
classifiers use a convolutional neural network (CNN) that has been trained with
errors in automatic segmentations against expert-labeled ground truth. Our
classifiers detect potentially-erroneous regions by considering a large context
region around a segmentation boundary. Corrections can then be performed by a
user with yes/no decisions, which reduces variation of information 7.5x faster
than previous proofreading methods. We also present a fully-automatic mode that
uses a probability threshold to make merge/split decisions. Extensive
experiments using the automatic approach and comparing performance of novice
and expert users demonstrate that our method performs favorably against
state-of-the-art proofreading methods on different connectomics datasets.
Comment: Supplemental material available at
http://rhoana.org/guidedproofreading/supplemental.pd
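A minimal sketch of how the two modes described in this abstract could be driven by the classifier scores; `score_boundary`, the 0.95 threshold, and the ranking step are illustrative assumptions, not the paper's implementation:

```python
# Illustrative sketch of threshold-based automatic proofreading decisions.
# `score_boundary` stands in for a CNN classifier and is assumed to return
# the probability that a candidate merge/split correction should be applied.

def apply_automatic_proofreading(candidates, score_boundary, threshold=0.95):
    """Accept every candidate correction whose classifier score exceeds threshold.

    candidates     : iterable of candidate merge/split locations (opaque objects)
    score_boundary : callable mapping a candidate to a probability in [0, 1]
    threshold      : decision cutoff for the fully-automatic mode (assumed value)
    """
    accepted, rejected = [], []
    for candidate in candidates:
        p = score_boundary(candidate)
        (accepted if p >= threshold else rejected).append((candidate, p))
    return accepted, rejected


def rank_for_user(candidates, score_boundary):
    """Guided mode: present the highest-confidence candidates first for yes/no review."""
    return sorted(candidates, key=score_boundary, reverse=True)
```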
Content-adaptive lenticular prints
Lenticular prints are a popular medium for producing automultiscopic glasses-free 3D images. The light field emitted by such prints has a fixed spatial and angular resolution. We increase both perceived angular and spatial resolution by modifying the lenslet array to better match the content of a given light field. Our optimization algorithm analyzes the input light field and computes an optimal lenslet size, shape, and arrangement that best matches the input light field given a set of output parameters. The resulting emitted light field shows higher detail and smoother motion parallax compared to fixed-size lens arrays. We demonstrate our technique using rendered simulations and by 3D printing lens arrays, and we validate our approach in simulation with a user study.
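As a toy illustration of the spatial/angular trade-off such an optimizer navigates, the sketch below brute-forces a single lenslet pitch for a 1D light-field slice by reconstruction error; the paper's method additionally optimizes lens shape and arrangement and is not limited to one global pitch, so treat every name and parameter here as hypothetical:

```python
import numpy as np

def choose_lenslet_pitch(light_field, print_resolution, candidate_pitches):
    """Pick the lenslet pitch that best reproduces a 1D light-field slice.

    light_field       : (S, A) array, S spatial samples x A angular samples
    print_resolution  : number of printable dots available under the lens sheet
    candidate_pitches : candidate lenslet widths, in printed dots per lens

    A pitch of `p` dots shows `p` views per lens but only print_resolution // p
    lenses across, so spatial and angular resolution trade off against each other.
    We simulate that resampling and keep the pitch with the lowest error.
    """
    S, A = light_field.shape
    best_pitch, best_err = None, np.inf
    for pitch in candidate_pitches:
        n_lenses = print_resolution // pitch      # spatial samples on the print
        n_views = min(pitch, A)                   # angular samples per lens
        # Downsample to what this pitch can physically display...
        s_idx = np.linspace(0, S - 1, n_lenses).round().astype(int)
        a_idx = np.linspace(0, A - 1, n_views).round().astype(int)
        shown = light_field[np.ix_(s_idx, a_idx)]
        # ...then upsample back (nearest neighbour) to compare against the input.
        s_back = np.linspace(0, n_lenses - 1, S).round().astype(int)
        a_back = np.linspace(0, n_views - 1, A).round().astype(int)
        recon = shown[np.ix_(s_back, a_back)]
        err = np.mean((recon - light_field) ** 2)
        if err < best_err:
            best_pitch, best_err = pitch, err
    return best_pitch
```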
Semantic Attention Flow Fields for Monocular Dynamic Scene Decomposition
From video, we reconstruct a neural volume that captures time-varying color,
density, scene flow, semantics, and attention information. The semantics and
attention let us identify salient foreground objects separately from the
background across spacetime. To mitigate low-resolution semantic and attention
features, we compute pyramids that trade detail with whole-image context. After
optimization, we perform a saliency-aware clustering to decompose the scene. To
evaluate real-world scenes, we annotate object masks in the NVIDIA Dynamic
Scene and DyCheck datasets. We demonstrate that this method can decompose
dynamic scenes in an unsupervised way with competitive performance to a
supervised method, and that it improves foreground/background segmentation over
recent static/dynamic split methods. Project Webpage:
https://visual.cs.brown.edu/saff
Comment: International Conference on Computer Vision (ICCV) 2023; 10 pages, 8
figures, 3 tables
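The pyramid idea can be pictured with a generic multi-scale construction (illustrative only; the paper's exact pyramid may differ): pool the low-resolution feature map over progressively larger windows and stack the levels so each pixel carries both local detail and whole-image context.

```python
import numpy as np

def feature_pyramid(feat, levels=3):
    """Build a simple multi-scale pyramid from an (H, W, C) feature map.

    Coarser levels average over larger windows, trading per-pixel detail for
    whole-image context; upsampling each level back to (H, W) lets all levels
    be stacked per pixel. This is a generic construction, not the paper's.
    """
    H, W, C = feat.shape
    pyramid = [feat]
    for level in range(1, levels):
        k = 2 ** level
        # Average-pool with window k (crop to a multiple of k for simplicity).
        h, w = (H // k) * k, (W // k) * k
        pooled = feat[:h, :w].reshape(h // k, k, w // k, k, C).mean(axis=(1, 3))
        # Nearest-neighbour upsample back to the original resolution.
        up = pooled.repeat(k, axis=0).repeat(k, axis=1)
        up = np.pad(up, ((0, H - h), (0, W - w), (0, 0)), mode="edge")
        pyramid.append(up)
    return np.concatenate(pyramid, axis=-1)  # (H, W, C * levels)
```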
'Tax-free' 3DMM Conditional Face Generation
3DMM conditioned face generation has gained traction due to its well-defined
controllability; however, the trade-off is lower sample quality: Previous works
such as DiscoFaceGAN and 3D-FM GAN show a significant FID gap compared to the
unconditional StyleGAN, suggesting that there is a quality tax to pay for
controllability. In this paper, we challenge the assumption that quality and
controllability cannot coexist. To pinpoint the previous issues, we
mathematically formalize the problem of 3DMM conditioned face generation. Then,
we devise simple solutions to the problem under our proposed framework. This
results in a new model that effectively removes the quality tax between 3DMM
conditioned face GANs and the unconditional StyleGAN.
Comment: Accepted to the AI for Content Creation Workshop at CVPR 202
Consistent Video Filtering for Camera Arrays
Visual formats have advanced beyond single-view images and videos: 3D movies are commonplace, researchers have developed multi-view navigation systems, and VR is helping to push light field cameras to mass market. However, editing tools for these media are still nascent, and even simple filtering operations like color correction or stylization are problematic: naively applying image filters per frame or per view rarely produces satisfying results due to time and space inconsistencies. Our method preserves and stabilizes filter effects while remaining agnostic to the inner workings of the filter. It captures filter effects in the gradient domain, then uses input frame gradients as a reference to impose temporal and spatial consistency. Our least-squares formulation adds minimal overhead compared to naive data processing. Further, when filter cost is high, we introduce a filter transfer strategy that reduces the number of per-frame filtering computations by an order of magnitude, with only a small reduction in visual quality. We demonstrate our algorithm on several camera array formats including stereo videos, light fields, and wide baselines.
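A rough sketch of the least-squares idea, restricted to the temporal terms and with an assumed weight; the spatial terms and the filter transfer strategy are omitted, so this is a simplification rather than the paper's solver:

```python
import numpy as np

def temporally_consistent_filter(filtered, inputs, lam=5.0):
    """Least-squares sketch of gradient-domain temporal consistency.

    filtered : (T, ...) per-frame filter results (naively applied per frame)
    inputs   : (T, ...) original input frames
    lam      : assumed weight of the temporal-gradient term

    For every pixel, solve
        min_o  sum_t (o_t - filtered_t)^2
             + lam * sum_t ((o_t - o_{t-1}) - (inputs_t - inputs_{t-1}))^2
    so the output keeps the filter's look while its temporal gradients follow
    the input video's gradients, suppressing frame-to-frame flicker.
    """
    T = filtered.shape[0]
    F = filtered.reshape(T, -1)
    I = inputs.reshape(T, -1)

    # Finite-difference operator D: (T-1, T); D @ o gives temporal gradients.
    D = np.zeros((T - 1, T))
    D[np.arange(T - 1), np.arange(T - 1)] = -1.0
    D[np.arange(T - 1), np.arange(1, T)] = 1.0

    # Normal equations: (Id + lam * D^T D) o = f + lam * D^T (D i), per pixel.
    A = np.eye(T) + lam * D.T @ D
    b = F + lam * D.T @ (D @ I)
    O = np.linalg.solve(A, b)
    return O.reshape(filtered.shape)
```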
TöRF: Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis
Neural networks can represent and accurately reconstruct radiance fields for
static 3D scenes (e.g., NeRF). Several works extend these to dynamic scenes
captured with monocular video, with promising performance. However, the
monocular setting is known to be an under-constrained problem, and so methods
rely on data-driven priors for reconstructing dynamic content. We replace these
priors with measurements from a time-of-flight (ToF) camera, and introduce a
neural representation based on an image formation model for continuous-wave ToF
cameras. Instead of working with processed depth maps, we model the raw ToF
sensor measurements to improve reconstruction quality and avoid issues with low
reflectance regions, multi-path interference, and a sensor's limited
unambiguous depth range. We show that this approach improves robustness of
dynamic scene reconstruction to erroneous calibration and large motions, and
discuss the benefits and limitations of integrating RGB+ToF sensors that are
now available on modern smartphones.
Comment: Accepted to NeurIPS 2021. Web page: https://imaging.cs.cmu.edu/torf/
NeurIPS camera-ready updates -- added quantitative comparisons to new methods,
visual side-by-side comparisons performed on larger baseline camera sequences.
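The continuous-wave ToF image formation the abstract refers to can be illustrated with the textbook four-phase ("quad") model below; the actual TöRF formulation, which additionally addresses low reflectance and multi-path interference, is richer than this sketch:

```python
import numpy as np

C_LIGHT = 3e8  # speed of light, m/s

def cw_tof_measurement(depth, amplitude, offset, mod_freq, psi):
    """Simulate one raw correlation measurement of a continuous-wave ToF camera.

    Standard model: light modulated at mod_freq returns with phase
    phi = 4*pi*mod_freq*depth / c, and correlating against a reference shifted
    by psi yields  offset + amplitude * cos(phi + psi).
    """
    phi = 4.0 * np.pi * mod_freq * depth / C_LIGHT
    return offset + amplitude * np.cos(phi + psi)

def depth_from_quads(c0, c1, c2, c3, mod_freq):
    """Recover depth from four measurements at psi = 0, pi/2, pi, 3*pi/2."""
    phi = np.arctan2(c3 - c1, c0 - c2) % (2.0 * np.pi)
    return C_LIGHT * phi / (4.0 * np.pi * mod_freq)

# Example: a 1.5 m target observed at a 30 MHz modulation frequency.
d_true = 1.5
quads = [cw_tof_measurement(d_true, amplitude=1.0, offset=0.5, mod_freq=30e6, psi=p)
         for p in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)]
print(depth_from_quads(*quads, mod_freq=30e6))  # ~1.5, within the ~5 m unambiguous range
```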